Optimizing Initial Configurations of Neural Networks for the Task of Natural Language Learning

Author

  • Jaime J. Dávila
Abstract

One approach used to develop computer systems for natural language processing (NLP) is that of Artificial Neural Networks (NNs). Because of the large number of parameters a NN has (e.g. network topology, learning algorithm, transfer functions) and the way they interact, finding optimal parameter values for any particular task can be extremely difficult.

Topology can greatly affect the performance of a NN. Stolcke (1990) found that simple sentences could be processed by NNs with one hidden layer. Performance degraded, however, when the NN was presented with embedded sentences, for which more complex topologies were needed. What topology to use for any particular task is still an open question. Most decisions regarding NN topology are based on ideas of how the problem should be tackled. Jain (1991) used a NN that first decomposed a sentence into phrases and later defined relationships between the phrases. Miikkulainen (1996) divided a NN into a parser, a segmenter, and a stack. These NNs vary in the number of layers, the number of nodes in each layer, and how these layers are connected to each other.

Another aspect affecting performance is the corpus used to train the NN. Nenov and Dyer (1994) found that training with individual words before showing sentences increased performance. Elman (1993) found that training a NN with simple sentences first and complex sentences later produced better results than presenting all sentences in one session.

Although researchers have been successful in using NNs for NLP, choosing an initial configuration is quite complex. An automated process that can find an optimal set of parameters would be helpful in designing such a system. In my research I use Genetic Algorithms (GAs) to find what NN parameter values produce better performance for a particular NL task.

There is one input node for each word in the vocabulary. A sentence is presented to the NN one word at a time by activating the corresponding node at the input layer. The NN incrementally generates a description of the input by correctly identifying the parts of the sentence and how they relate to each other. For example, given a sentence, the NN should respond by (1) identifying each word in the sentence, e.g. that a word is a movement verb; (2) identifying phrases, e.g. that a group of words is a noun phrase; and (3) identifying the relationships among phrases, e.g. that one phrase is the agent of another. This is similar to Jain (1991).

The NN has 75 hidden nodes, divided into between 1 and 30 layers, as determined by the GA. The GA also determines how these layers are connected to each other, and which learning algorithm and transfer functions to use.

The NN is tested on a language of 508 sentences with three different complexity levels. The basic training set is 20% of the complete language. The GA makes three major decisions about training: (1) whether to train with sentences from all complexity levels at once, or with sentences from the simpler level first and more complex sentences later; (2) whether to train with sentences from all complexity levels in the same proportion, or with more sentences from one level or another; and (3) whether the training set should be increased past the basic 20%. If it is, the NN's fitness is decreased by a factor equal to the increase in the training set.

By studying the configurations chosen by the GA, I hope to identify which parameters are critical for the NLP task and which values for these parameters produce the best performance. These results will provide a better understanding of NN behavior and point to improved NN configurations.
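The search space described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. The constants (75 hidden nodes, 1 to 30 layers, a 20% base training set) come from the abstract; the `Genome` fields, the helper names, and the random sampling scheme are assumptions made for illustration. The fitness penalty uses one possible reading of "decreased by a factor equal to the increase in the training set": the raw score is divided by the ratio of the chosen training fraction to the base 20%. Layer connectivity, learning algorithm, and transfer function choices are left out for brevity.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

HIDDEN_NODES = 75      # from the abstract
MAX_LAYERS = 30        # from the abstract
BASE_FRACTION = 0.20   # basic training set: 20% of the language


@dataclass
class Genome:
    """Hypothetical encoding of one candidate configuration."""
    layer_sizes: List[int]                      # split of the 75 hidden nodes
    simple_first: bool                          # decision (1): staged training
    level_weights: Tuple[float, float, float]   # decision (2): mix of levels
    train_fraction: float                       # decision (3): >= 0.20


def random_layer_sizes(rng: random.Random) -> List[int]:
    """Split HIDDEN_NODES into 1..MAX_LAYERS non-empty layers."""
    n_layers = rng.randint(1, MAX_LAYERS)
    # Distinct cut points guarantee that every layer has at least one node.
    cuts = sorted(rng.sample(range(1, HIDDEN_NODES), n_layers - 1))
    bounds = [0] + cuts + [HIDDEN_NODES]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


def random_genome(rng: random.Random) -> Genome:
    """Draw one random candidate (assumed sampling scheme)."""
    raw = [rng.random() + 0.1 for _ in range(3)]
    total = sum(raw)
    return Genome(
        layer_sizes=random_layer_sizes(rng),
        simple_first=rng.random() < 0.5,
        level_weights=tuple(w / total for w in raw),
        train_fraction=rng.uniform(BASE_FRACTION, 1.0),
    )


def penalized_fitness(raw_score: float, train_fraction: float) -> float:
    """Divide the raw score by the growth of the training set (assumed reading)."""
    if train_fraction < BASE_FRACTION:
        raise ValueError("training set cannot be smaller than the base 20%")
    growth = train_fraction / BASE_FRACTION  # 1.0 means no increase
    return raw_score / growth


if __name__ == "__main__":
    rng = random.Random(0)
    genome = random_genome(rng)
    print(genome)
    print(penalized_fitness(0.8, genome.train_fraction))
```

Under this reading, a configuration that doubles the training set to 40% has to score twice as well as one trained on the basic 20% to receive the same fitness, so the GA only favors larger training sets when they pay for themselves.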





Publication date: 1998